A Declarative Framework for Constrained Clustering
Authors
Abstract
In recent years, clustering has been extended to constrained clustering in order to integrate knowledge about objects or clusters, but adding such constraints generally requires developing new algorithms. We propose a declarative and generic framework, based on Constraint Programming, which makes it possible to design clustering tasks by specifying an optimization criterion and constraints either on the clusters or on pairs of objects. The framework supports several classical optimization criteria, which can be coupled with different kinds of constraints. Relying on Constraint Programming has two main advantages: declarativity, which makes it easy to add new constraints, and the ability to find an optimal solution satisfying all the constraints (when one exists). On the other hand, computation time depends on the constraints and on their ability to reduce the domains of the variables and thus avoid an exhaustive search.
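To make the idea of "a criterion plus constraints" concrete, here is a minimal sketch in plain Python. It is not the authors' Constraint Programming model: a small backtracking search with bounding stands in for a CP solver. The function name `constrained_clustering`, its parameters, the choice of maximum cluster diameter as the criterion, and the toy data are all assumptions made for illustration.

```python
from math import dist, inf


def constrained_clustering(points, k, must_link=(), cannot_link=(), min_size=1):
    """Exact search for a partition into k clusters minimizing the maximum diameter.

    Illustrative stand-in for a CP model: one decision per object (its cluster
    index), pairwise constraints checked on partial assignments, and a bound on
    the criterion used to prune the search.
    """
    n = len(points)
    d = [[dist(p, q) for q in points] for p in points]
    best = {"cost": inf, "assign": None}
    assign = [None] * n

    def pair_ok(i, c):
        # Check must-link / cannot-link pairs between object i (tentatively in
        # cluster c) and the objects that are already assigned.
        for a, b in must_link:
            other = b if a == i else a if b == i else None
            if other is not None and assign[other] is not None and assign[other] != c:
                return False
        for a, b in cannot_link:
            other = b if a == i else a if b == i else None
            if other is not None and assign[other] == c:
                return False
        return True

    def search(i, used, diameter):
        if diameter >= best["cost"]:
            return  # bound: this branch cannot beat the best solution so far
        if i == n:
            sizes = [assign.count(c) for c in range(k)]
            if used == k and min(sizes) >= min_size:
                best["cost"], best["assign"] = diameter, assign.copy()
            return
        # Symmetry breaking: object i may open at most one new cluster.
        for c in range(min(used + 1, k)):
            if not pair_ok(i, c):
                continue
            new_diameter = max([diameter] + [d[i][j] for j in range(i) if assign[j] == c])
            assign[i] = c
            search(i + 1, max(used, c + 1), new_diameter)
            assign[i] = None

    search(0, 0, 0.0)
    return best["cost"], best["assign"]


# Toy data (hypothetical): two well-separated groups on a line.
points = [(0,), (1,), (2,), (10,), (11,), (12,)]
cost, assignment = constrained_clustering(points, k=2)
cost_cl, assignment_cl = constrained_clustering(points, k=2, cannot_link=[(0, 1)])
```

Adding the cannot-link pair changes only the call, not the search procedure, which is the declarative aspect the abstract emphasizes. The sketch also shows why running time depends on the constraints: each violated pair or exceeded bound removes a whole subtree from the search.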
Similar resources
Repeated Record Ordering for Constrained Size Clustering
One of the main techniques used in data mining is data clustering, which has many applications in computer science, biology, and the social sciences. Constrained clustering is a type of clustering in which side information provided by the user is incorporated into current clustering algorithms. One of the well-researched constrained clustering algorithms is called microaggregation. In a microaggreg...
Un nouveau modèle pour la classification non supervisée sous contraintes
Constrained clustering is an important task in data mining. Over the last ten years, much work has been done to extend classical clustering algorithms to handle user-defined constraints, but these extensions are in general limited to one kind of constraint. In our previous work (Dao et al., 2013a), we proposed a declarative and general framework, based on Constraint Programming, which enables to desi...
Constrained Minimum Sum of Squares Clustering by Constraint Programming
The Within-Cluster Sum of Squares (WCSS) is the most widely used criterion in cluster analysis. Optimizing this criterion has been proved NP-hard and has been studied by different communities. On the other hand, constrained clustering, which allows prior user knowledge to be integrated into the clustering process, has received much attention over the last decade. As far as we know, there is a single approach that ...
17th European Conference on Machine Learning (ECML) and 10
Inductive databases (IDBs) represent a database view on data mining and knowledge discovery. IDBs contain not only data, but also generalizations (patterns and models) valid in the data. In an IDB, ordinary queries can be used to access and manipulate data, while inductive queries can be used to generate (mine), manipulate, and apply patterns. In the IDB framework, patterns become "first-class...
On the integration of declarative choreographies and Commitment-based agent societies into the SCIFF logic programming framework
The definition of choreography specification languages for Service Oriented Systems poses important challenges. Mainstream approaches tend to focus on procedural aspects, leading to over-constrained and over-specified models. Because of such a drawback, declarative languages are gaining popularity as a better way to model service choreographies. A similar issue was met in the Multi-Agent System...
Publication date: 2013